Feedback

xxx777xxxASD:
If my Reddit account hadn't been suspended for some reason, I'd be recommending this model to everyone.
Right now I'm using the IQ2_M quant with 24k of 4-bit context, and this is the best model I've tried for roleplay (I can't run Behemoth due to its size).
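For anyone wondering whether a similar setup fits their card, the KV-cache cost of that 24k 4-bit context can be roughly estimated. A minimal sketch; the architecture numbers below (72 layers, 8 KV heads, head dim 128) are assumptions for illustration, in the spirit of a pruned Mistral-Large-style GQA model, not Endurance's published config:

```python
# Rough KV-cache memory estimator for GGUF inference.
# ASSUMED dims below; check the model's actual config for real values.

def kv_cache_bytes(ctx_len, n_layers, n_kv_heads, head_dim, bytes_per_elt):
    """Combined size of the K and V caches, ignoring quantization overhead."""
    return 2 * n_layers * n_kv_heads * head_dim * ctx_len * bytes_per_elt

# 4-bit cache is ~0.5 bytes per element (plus some per-block overhead).
gib = kv_cache_bytes(ctx_len=24576, n_layers=72, n_kv_heads=8,
                     head_dim=128, bytes_per_elt=0.5) / 2**30
print(f"~{gib:.2f} GiB for 24k of 4-bit KV cache")  # → ~1.69 GiB
```

The model weights themselves dominate; this only bounds the extra VRAM the context window costs on top of them.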
Back in the Llama 2 days I used 7B, 9B and 11B models via exllama. Then I tried Mixtral and was so fascinated by MoE capabilities and speed that I made a lot of frankenMoE merges in the Llama 3 era. Then for a month or two I stuck to the "free API" CommandR+, and I sometimes use it even now.
Then I bought myself a P40, and with my weird 36GB VRAM RTX 3060 + P40 setup I decided to try big models. Initially I tried some from Sao, but his models were too focused on eRP (or maybe my settings were just bad). Then I tried Magnum 72B V2 and was amazed by its quality; when Magnum 72B V4 came out I stuck with it for some time before trying Nemotron 70B and its finetunes like Nautilus and Sunfall. Honestly, I wasn't impressed by Nautilus, but Sunfall was great and amazed me almost every day, so it became my daily driver.
However, even the IQ2_M quant of Endurance 100B amazed me with its quality. First, it has much higher emotional intelligence; I haven't seen a better model for good drama. Second, its ability to remember things: I often want to see how an LLM understands what's happening in the roleplay, so I write something like "Stop the roleplay, analyze it, write an essay, split it into subtopics" or "Stop the roleplay, analyze {{user}}, write an essay, split it into subtopics". CommandR+ was the only one that could break out of character and write the analysis without any crutches on the first try; Nemotron and its finetunes were also great at this. Endurance's analysis, however, was something else: not only did it break character on the first try, its analytical capabilities were the best among every model I've tried for RP purposes... and I'm not even sure I'm using fitting settings.
The only "bad" thing I can say is that my speed dropped by ~1 t/s compared to ~70B models, but that was expected. If using row split didn't turn my output into garbage it might even be better; anyway, I'll try row split again later. koboldcpp has been updated many times since, so I have hope.
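For context on why row split is worth retrying: with the default layer split, each GPU owns whole layers and they take turns; with row split (llama.cpp's `--split-mode row`, if I recall the flag correctly), each GPU owns a slice of every weight matrix's rows, so both GPUs work on the same layer at once. A toy pure-Python sketch of why slicing by rows changes nothing mathematically:

```python
# Toy illustration of row split: give each "GPU" a slice of a layer's
# weight-matrix rows, compute partial outputs in parallel, concatenate.

def matvec(W, x):
    """Plain matrix-vector product over nested lists."""
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) for row in W]

W = [[1, 2], [3, 4], [5, 6], [7, 8]]   # one layer's 4x2 weight matrix
x = [10, 20]

W_gpu0, W_gpu1 = W[:2], W[2:]          # rows 0-1 on GPU 0, rows 2-3 on GPU 1
y_split = matvec(W_gpu0, x) + matvec(W_gpu1, x)   # concatenate partial outputs

assert y_split == matvec(W, x)         # identical to the unsplit layer
print(y_split)                         # → [50, 110, 170, 230]
```

The garbage output the commenter saw would be a backend/driver issue rather than anything inherent to the technique, which is presumably why a koboldcpp update could fix it.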
Well, what can I say... Bravo, just bravo.
xxx777xxxASD:
You probably wouldn't be able to run 24k context with 1GB less VRAM; try 16k with koboldcpp, or run 18~20k context using llama.cpp.
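The 16k-vs-24k suggestion follows from the per-token KV-cache cost: divide the VRAM left after the weights by that cost to get the context that fits. A sketch under the same kind of assumptions as before; the dims (72 layers, 8 KV heads, head dim 128) are hypothetical, not Endurance's real config:

```python
# How much context fits in a given KV-cache VRAM budget?
# Dims are ASSUMPTIONS for illustration (GQA, pruned Mistral-Large-style).

def max_context(budget_bytes, n_layers=72, n_kv_heads=8, head_dim=128,
                bytes_per_elt=0.5):
    """Largest context length whose K+V cache fits in budget_bytes."""
    per_token = 2 * n_layers * n_kv_heads * head_dim * bytes_per_elt
    return int(budget_bytes // per_token)

print(max_context(2 * 2**30))   # 4-bit cache tokens in a 2 GiB budget → 29127
```

Under these assumed dims, each GiB of budget buys roughly 14.5k tokens of 4-bit cache, which is the right order of magnitude for "1GB less VRAM means dropping from 24k toward 16k".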
Ainonake:
Even with KV cache quantization it doesn't work.
Hmm, but it runs relatively okay on CPU. I'll try IQ2_XS on GPU, and try the original Behemoth / this model at a higher quant on CPU.
Ainonake:
IQ2_XS seems smart at first glance. I'll compare it with Q4_K_M Behemoth.
Ainonake:
Non-English performance is nuked, btw (either by the English-calibrated low imatrix quant or by the layer removal).
ShadowClone:
Great model. I think this is finally the nail in the MidnightMiqu coffin for 48GB of VRAM. Even though I've only tested it with one card, it feels like a smarter version of MidnightMiqu that retains its creativity while picking up on more nuances.
Olafangensan:
Just finished a somewhat complex scenario with dubious moral implications (the best kind).
90% of models will either have the character screaming and yelling at you like some r*pe victim (which she is not, by a mile) or go full S E X O on it. I haven't tried Midnight Miqu yet, but I have a feeling it would just spiral into a full-blown soap-opera kind of drama.
That said... this model did none of that. The sheer emotional intelligence took me by complete surprise: the girl, whose character is wild and aggressive, slowly softened as a real connection formed between her and her fiancé as they spent the night going against her parents' wishes and just cuddling to sleep.
I love how genuinely graceful the slope of her emotional progression was. She didn't give in right away, nor did she fight until blood came out (that did happen the next morning, though; lowering the temperature worked wonders to fix it). At some point I even added some side characters, and it held up marvelously!
It even played around with the slop a lot, reshaping it to fit the context of the scene. Love it!
This is now my favorite model, no questions asked. The dream of running it locally is the only thing left before this becomes a genuinely useful writing assistant for me.
TheDrummer:
@Olafangensan Thank you! If you're not running it locally, you might want to run Behemoth 123B instead.